Eigenvalue-Corrected Natural Gradient Based on a New Approximation
Abstract
Second-order optimization methods for training deep neural networks (DNNs) have attracted considerable research interest. A recently proposed method, Eigenvalue-corrected Kronecker Factorization (EKFAC) (George et al., 2018), interprets the natural gradient update as a diagonal method and corrects the inaccurate re-scaling factor in the Kronecker-factored eigenbasis. Gao et al. (2020) consider a new approximation to the natural gradient, which approximates the Fisher information matrix (FIM) by a constant multiplied by the Kronecker product of two matrices and keeps the trace equal before and after the approximation. In this work, we combine the ideas of these two methods and propose Trace-restricted Eigenvalue-corrected Kronecker Factorization (TEKFAC). The proposed method not only corrects the inexact re-scaling factor under the Kronecker-factored eigenbasis, but also adopts the new approximation and the effective damping technique proposed by Gao et al. (2020). We also discuss the differences and relationships among the Kronecker-factored approximations. Empirically, our method outperforms SGD with momentum, Adam, EKFAC and TKFAC on several DNNs.
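The trace restriction described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the two factor matrices are estimated, so the inputs left_raw and right_raw, the toy matrix fisher, and every function name below are hypothetical. The sketch only shows that if both factors are rescaled to unit trace and the constant is set to the trace of the FIM block, the approximation has the same trace as that block, because tr(A ⊗ B) = tr(A) tr(B).

```python
def trace(m):
    """Sum of the diagonal entries of a square matrix (nested lists)."""
    return sum(m[i][i] for i in range(len(m)))


def kron(a, b):
    """Kronecker product of square matrices a (n x n) and b (m x m)."""
    n, m = len(a), len(b)
    size = n * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def normalize_trace(m):
    """Rescale a matrix so that its trace equals 1."""
    t = trace(m)
    return [[x / t for x in row] for row in m]


def trace_restricted_approx(fisher, left_raw, right_raw):
    """Build delta * (phi kron psi) with tr(phi) = tr(psi) = 1.

    Assumption: delta is taken to be tr(fisher), which makes the trace of
    the approximation equal to the trace of the original block.
    """
    delta = trace(fisher)
    phi = normalize_trace(left_raw)
    psi = normalize_trace(right_raw)
    approx = [[delta * x for x in row] for row in kron(phi, psi)]
    return delta, phi, psi, approx


# Toy, made-up inputs: a 4 x 4 symmetric block and two 2 x 2 factors.
fisher = [[2.0, 0.5, 0.1, 0.0],
          [0.5, 1.0, 0.0, 0.1],
          [0.1, 0.0, 3.0, 0.2],
          [0.0, 0.1, 0.2, 2.0]]
left_raw = [[2.0, 0.3], [0.3, 6.0]]
right_raw = [[4.0, 1.0], [1.0, 2.0]]

delta, phi, psi, approx = trace_restricted_approx(fisher, left_raw, right_raw)
print(trace(fisher), trace(approx))  # the two traces agree up to rounding
```

The eigenvalue correction that TEKFAC adds on top of this (re-estimating the scaling in the Kronecker-factored eigenbasis) is left out here, since it needs an eigendecomposition and details the abstract does not give.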
Similar resources
A New Hybrid Conjugate Gradient Method Based on Eigenvalue Analysis for Unconstrained Optimization Problems
In this paper, two extended three-term conjugate gradient methods based on the Liu-Storey (LS) conjugate gradient method are presented to solve unconstrained optimization problems. A remarkable property of the proposed methods is that, based on eigenvalue analysis, the search direction always satisfies the sufficient descent condition independently of the line search method. The globa...
Enhancing Eigenvalue Approximation by Gradient Recovery
The polynomial preserving recovery (PPR) is used to enhance the finite element eigenvalue approximation. Remarkable fourth order convergence is observed for linear elements under structured meshes as well as unstructured initial meshes (produced by the Delaunay triangulation) with the conventional bisection refinement.
Enhancing eigenvalue approximation by gradient recovery on adaptive meshes
Gradient recovery has been widely used for a posteriori error estimates (see Ainsworth & Oden, 2000; Babuška & Strouboulis, 2001; Chen & Xu, 2007; Fierro & Veeser, 2006; Zhang, 2007; Zienkiewicz et al., 2005; Zienkiewicz & Zhu, 1987, 1992a,b). Recently, it has been employed to enhance the eigenvalue approximations by the finite-element method under certain mesh conditions (see Naga et al., 2006...
A new gradient-corrected exchange functional
A new gradient-corrected exchange functional (G96) is introduced. While similar to Becke’s B88 functional, it is much simpler and its potential in finite systems is asymptotically unbounded. The mean absolute deviations of the B88 and G96 exchange energies from the corresponding Hartree-Fock values for the atoms H to Ar are 12 ± 5 and 8 ± 5 mEh, respectively. In combination with the LYP corre...
Task-based language teaching in Iran: a mixed study through constructing and validating a new questionnaire based on theoretical, sociocultural, and educational frameworks
Various aspects of life in Iran, including lifestyle, science, and technical and technological facilities, can be regarded as more or less imported. The English language and the way it is taught are no exception. Even so, the question sometimes arises whether a particular method is compatible with the theoretical, sociocultural, and educational infrastructure of Iranian society. This study was carried out using mixed methods. A questionnaire for language learners was also ...
Journal
Journal title: Asia-Pacific Journal of Operational Research
Year: 2023
ISSN: 1793-7019, 0217-5959
DOI: https://doi.org/10.1142/s0217595923400055